Model-Powered Conditional Independence Test

Authors

  • Rajat Sen
  • Ananda Theertha Suresh
  • Karthikeyan Shanmugam
  • Alexandros G. Dimakis
  • Sanjay Shakkottai
Abstract

We consider the problem of non-parametric conditional independence testing (CI testing) for continuous random variables. Given i.i.d. samples from the joint distribution f(x, y, z) of continuous random vectors X, Y, and Z, we determine whether X ⊥ Y | Z. We approach this by converting the conditional independence test into a classification problem. This allows us to harness powerful classifiers such as gradient-boosted trees and deep neural networks. These models can handle complex probability distributions and allow us to perform significantly better than the prior state of the art in high-dimensional CI testing. The main technical challenge in the classification problem is the need for samples from the conditional product distribution f^CI(x, y, z) = f(x|z)f(y|z)f(z), which equals the joint distribution if and only if X ⊥ Y | Z, when given access only to i.i.d. samples from the true joint distribution f(x, y, z). To tackle this problem, we propose a novel nearest-neighbor bootstrap procedure and show theoretically that the generated samples are indeed close to f^CI in total variation distance. We then develop generalization bounds for classification in our setting, which translate into error bounds for CI testing. We provide a novel analysis of Rademacher-type classification bounds in the presence of non-i.i.d., near-independent samples. We empirically validate the performance of our algorithm on simulated and real datasets and show performance gains over previous methods.
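The nearest-neighbor bootstrap mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the procedure, so the details here are assumptions made for illustration: the function name `nn_bootstrap`, the random split into two halves, the use of a single Euclidean nearest neighbor in the z-coordinates, and the synthetic Gaussian data. The idea is that pairing each (x, z) with the y of a different sample whose z is close should give triples that approximately follow f(x|z)f(y|z)f(z).

```python
import math
import random


def nn_bootstrap(samples, seed=0):
    """Illustrative nearest-neighbor bootstrap (a sketch, not the paper's code).

    samples: list of (x, y, z) triples; each coordinate is a tuple of floats.
    Returns len(samples) // 2 triples (x, y_nn, z), where y_nn is taken from
    the held-out sample whose z is closest to the query's z.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)

    # Assumption: split into a query half and a disjoint neighbor pool.
    half = len(shuffled) // 2
    queries, pool = shuffled[:half], shuffled[half:]

    bootstrapped = []
    for x, _, z in queries:
        # Brute-force nearest neighbor in the z-coordinates (O(n^2) overall).
        _, y_nn, _ = min(pool, key=lambda s: math.dist(z, s[2]))
        # Keep the query's own x and z, but swap in the neighbor's y.
        bootstrapped.append((x, y_nn, z))
    return bootstrapped


# Hypothetical data in which X and Y depend on each other only through Z.
data_rng = random.Random(1)
data = []
for _ in range(200):
    z = (data_rng.gauss(0.0, 1.0),)
    x = (z[0] + data_rng.gauss(0.0, 0.1),)
    y = (z[0] + data_rng.gauss(0.0, 0.1),)
    data.append((x, y, z))

boot = nn_bootstrap(data)
```

In the classification view described above, one plausible setup would label original triples 1 and bootstrapped triples 0, train any classifier on the mix, and treat held-out accuracy near chance as consistent with X ⊥ Y | Z. That labeling and decision rule are also assumptions of this sketch.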

Related resources

Bayesian Test of Significance for Conditional Independence: The Multinomial Model

Conditional independence tests have received special attention lately in machine learning and computational intelligence related literature as an important indicator of the relationship among the variables used by their models. In the field of probabilistic graphical models, which includes Bayesian network models, conditional independence tests are especially important for the task of learning ...

Distribution-Free Learning of Graphical Model Structure in Continuous Domains

In this paper we present a probabilistic non-parametric conditional independence test of X and Y given a third variable Z in domains where X, Y, and Z are continuous. This test can be used for the induction of the structure of a graphical model (such as a Bayesian or Markov network) from experimental data. We also provide an effective method for calculating it from data. We show that our metho...

Predictive Independence Testing, Predictive Conditional Independence Testing, and Predictive Graphical Modelling

Testing (conditional) independence of multivariate random variables is a task central to statistical inference and modelling in general, though unfortunately one for which, to date, there does not exist a practicable workflow. State-of-the-art workflows suffer from the need for heuristic or subjective manual choices, high computational complexity, or strong parametric assumptions. We address these pro...

Conditional Mean and Quantile Dependence Testing in High Dimension

Motivated by applications in biological science, we propose a novel test to assess the conditional mean dependence of a response variable on a large number of covariates. Our procedure is built on the martingale difference divergence recently proposed in Shao and Zhang (2014), and it is able to detect certain types of departure from the null hypothesis of conditional mean independence without ma...

Testing Conditional Independence for Continuous Random Variables

A common statistical problem is the testing of independence of two (response) variables conditionally on a third (control) variable. In the first part of this paper, we extend Hoeffding’s concept of estimability of degree r to testability of degree r, and show that independence is testable of degree two, while conditional independence is not testable of any degree if the control variable is con...

Publication date: 2017